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Abstract 

An n-town, n e N, is a group of n buildings, each occupying a distinct 
position on a 2-dimensional integer grid. If we measure the distance between 
two buildings along the axis-parallel street grid, then an n-town has optimal 
shape if the sum of all pairwise Manhattan distances is minimized. This 
problem has been studied for cities, i.e., the limiting case of very large n. For 
cities, it is known that the optimal shape can be described by a differential 
equation, for which no closed-form solution is known. We show that optimal 
n-towns can be computed in 0(n 7 5 ) time. This is also practically useful, as 
it allows us to compute optimal solutions up to n = 80. 

Key words: Manhattan distance, average pairwise distance, integer points, 
dynamic programming 



1. Introduction 

Selecting an optimal set of locations is a fundamental problem, not just 
in real estate, but also in many areas of computer science. Typically, the 
task is to choose n sites from a given set of candidate locations; the objective 
is to pick a set that minimizes a cost function, e.g., the average distance 
between sites. As described below, there is a large variety of related results, 
motivated by different scenarios. 

In general, problems of this type are hard, even to approximate, as the 
problem of finding a clique of given size is a special case. Some of the nat- 
ural settings have a strong geometric flavor, so it is conceivable that more 
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positive results can be achieved by exploiting additional structures and prop- 
erties. However, even seemingly easy special cases are still surprisingly diffi- 
cult. Until now, there was no complexity result (positive or negative) for the 
scenario in which the candidate locations correspond to the full integer grid, 
with distances measured by the Manhattan metric (an n-town). Indeed, for 
the shape of area 1 with minimum average L\ distance (the "optimal shape of 
a city" , arising for the limit case of n approaching infinity) , no simple closed- 
form solution is known, suggesting that finding sets of n distinct grid points 
(the "optimal shapes of towns") may not be an easy task. This makes the 
problem mathematically challenging; in addition, the question of choosing n 
grid positions with minimum average L\ distance comes up naturally in grid 
computing, so the problem is of both practical and theoretical interest. 

In this paper, we give the first positive result by describing an 0(n 75 ) 
algorithm for computing sets of n distinct grid points with minimum average 
L\ distance. Our method is based on dynamic programming, and (despite 
of its relatively large exponent) for the first time allows computing optimal 
towns up to n = 80. 



1.1. Related Work 

Grid Computing. In grid computing, allocating a task requires selecting n 
processors from a given grid, and the average communication overhead corre- 
spond s to th e aver age Manhattan dista nce between processors; iMache and Lo 
(119961 . 119971 ) and iLeung et al.l (120021 ) propose various metrics for measur- 
ing the quality of a processor allocation , including the ave rage number of 
communication hops between processors. ILeung et al.l (120021 ) considered the 
problem of allocating processors on Cplant, a Sandia National Labs super- 
computer; they applied and evaluated a scheme based on space-filling curves, 
and they concluded that the average pairwise Manhattan distance between 
processors is an effective metric to optimize. 

The Continuous Versi on. Motivat e d by the problem of storing records in 
a 2-dimensional array, iKarp et al.l ( 119751 ) studied strategies that minimize 
average access time between successive queries; among other results, they 
described an optimal solution for the continuous version of our problem: 
What shape of area 1 minimizes t he average Manhatt an distance between 
two interior points? Independently, iBender et al.l (120041 ) solved this problem 
in the setting of a city, inspiring the subtitle of this paper. The optimal 
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Figure 1: Optimal towns for n = 2,..., 21. All optimal solutions are shown, up to 
symmetries; the numbers indicate the total distance between all pairs of points. The 
picture for 77 = 20 also contains the symmetry axes from Lemma |31 



solution is described by a differential equation, and no closed-form solution 
is known. 



Selecting k points out of n. iKrumke et al.l (119971 ) consider the discrete prob- 
lem of selecting a subset of k nodes from a network with n nodes to mini- 
mize their average pairwise distance. They prove a 2-approximation for met- 
ric distances and p rove hardness of approximation for arbitrary distances. 
Bender et al.l (120081 ) solve the geometric version of this problem, giving an 
efficient processor allocator for the Cplant setting described above, and a 
polynomial-time approximation scheme (PTAS) for minimizing the average 
Manhattan distance. F or the reverse problem o f maximizing the average 



Manhattan distance, see iFekete and Meijer (boost ). 



The k-median Problem. Given two sets D and F, the /c-median problem 
asks to choose a set of k points from D to minimize the average distance 
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to the points in F . For k = 1 this is the classical Fermat-Weber problem. 



Fekete et al.l (j2000l . 120051 ) considered the city-center problem: for a given city, 
find a point that minimizes the average Manhattan distance. They proved 
NP-hardness for general k and gave efficient algorithms for some special cases. 

The Quadratic Assignment Problem. Our problem is a special case of the 
quadratic assignment problem (QAP): Given n facilities, n locations, a ma- 
trix containing the amount of flow between any pair of facilities, and a matrix 
containing the distances between any pair of locations. The task is to assign 
every facility to a location such that the cost function which is proportional 
to the flow between the facilities mult iplied by the distan ces between the lo- 
cations is minimized. For a survey see lLoiola et al.l (120071 ) . The cost function 
in our problem and in the QAP are the same if we define the distances as 
the Manhattan distances between grid points and if we define all flows to 
be one. The QAP c an not be approxima t ed wi th in any polyn o mial factor 
unless P = NP; see ISahni and Gonzalez! (119761 ). lHassin et al.l (120091 ) con- 
sidered the metric version of this problem with the flow matrix being a 0/ 1 
incidence matrix of a graph. They state some inapproximability results as 
well as a constant-factor approximation for the case in which the graph has 
vertex degree two for all but one vertex. 

The Maximum Dispersion Problem. The reverse version of the discrete prob- 
lem, where the goal is to maximize the average distance between points, has 
also been studied: In the maximization version, called the maximum dis- 
persion problem, the objective is to pick k points from a set of size n so 
that the pairwise distance is maximized. When t he ed ge weights need not 
obey the triang le inequality. iKortsarz and Pelea (119931 ) give an 0(n 0,3885 )- 
approximation. Asahiro et al. ( 2000l ) impro ye this guarantee to a constant 
factor in the special case when k = Q(n) and lArora et al.l (119991 ) give a PTAS 
when \E\ = Q(n 2 ) and k = Q(n). 



When the edge weights obey the triangle inequalit y. iRavi et al. ( 
give a 4- approximation that runs in 0(n 2 ) time and lHassin et al. 



1994) 



(1997) 



give a 2-approximation that r uns in Q(n 2 + k 2 lo g k) time. For points in the 
plane and Euclidean distances, Ravi et aD ( 1994 ) give an approximation with 
performance bound arbit rarily close to tt/2 « 1.57. For Manhattan distances, 
Fekete and Meijerl ( 120031 ) give an optimal algorithm for fixed k and a PTAS 
for general k. Moreover, they provide a (v2+ e) -Approximation for Euclidean 
distances. 
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The Min-Sum k-Clustering Problem. Another related problem is called min- 
sum k-clustering or minimum k-clustering sum. The goal is to separate a 
graph into k clusters to minimize the sum of pairwise distances b e tween 
nodes in the same cluster. For general graphs, ISahni and Gonzalez ( 1976 ) 
show that this problem is NP-hard to approximate within any constant fac- 
tor for k > 3. In a metri c spac e the problem is easier to appro x imate : 



Gut tmann-Beck and Hassin (1998) give a 2 -appr oximation, llndykl fll999h 



gives a PTAS for k = 2, and iBartal et all (120011 ) give an 0(l/elog 
approximation for general k. 



1.2. Our Results 

We solve the ft-town problem with an 0(ft 7 5 )-time dynamic-programming 
algorithm. Our algorithm is based on some properties of an optimal town: an 
optimal ft-town is convex in the sense that it contains all grid points within 
its convex hull (Lemma [2]). It lies symmetric with respect to a horizontal and 
a vertical symmetry axis within a tolerance of ±1 that is due to parity issues 
(Lemma [3D- Furthermore, it fits in an 0{\fn) x 0(y/n) square (Lemma|5f. 
We also present computational results and discuss the relation between the 
optimum continuous cities and their discretized counterparts (ft-towns, and 
ft-block cities). 



2. Properties of Optimal Towns 

We want to find a set of n distinct points from the integer grid Z x Z 
such that the sum of all pairwise Manhattan distances is minimized. A set 
S C Z x Z of cardinality n is an n-town. An ft-town S is optimal if its cost 

^):=^££||*-t||i (1) 

ses teS 

is minimum. Figure [U shows solutions for small n and their cost, and Tabled] 
in Section |5] shows optimal cost values c t0 wn(^) for n < 80. We define the 
x-cost c x (S) as J2{st}esxs \ Sx ~ ^1, where s x is the x-coordinate of s; y-cost 
c y (S) is the sum of all ^-distances, and c(S) = c x (S) + c y (S). For two sets 
S and S', we define c(S, S') = J2{ s s '}esxs> II s — S '\U- ^ & cons ists of a single 
point t, we write c(t,S') instead of c({t},S') for convenience. A town S is 
convex if the set of grid points in the convex hull of S equals S. 



5 



In proving various properties of optimal towns, we will often make a local 
modification by moving a point t of a town to a different place r. The next 
lemma expresses the resulting cost change. 



Lemma 1. Let S be a town, t G S and r ^ S . Then, 
c((S \ t) U r) = c(S) - c{t, S) + c(r, S) - 



r — t\\i . 



Proof. Let p be a point in 5*. Then, its distance to t is \\t — p\\i and 
the distance to r is ||r — Hence, the change in the cost function is 

ll r ~ p\\i ~ \\t—p\\i- We need to subtract \\r — t\\i from the sum over all points 



Lemma 2. An optimal n-town is convex. 

The following proof holds in any dimension and with any norm for mea- 
suring the distance between points. 

Proof. We prove that a nonconvex n-town S cannot be optimal. Take a grid 
point x ^ 5* in the convex hull of S. Then there are points xi, X2, ...,itG5 
such that x = XiXi + A 2 x 2 + • • ■ + A^x^ for some Ai, X 2 , ■ ■ ■ ,Xk > with 
^2 Xi = 1. Because every norm is a convex function, and the sum of convex 
functions is again convex, the function fs(x) = c(x,S) = J2 s es \\ x ~ s \\ 1 ^ s 
convex. Therefore, 



Obviously, if we translate every point from an n-town by the same vector, 
the cost of the town does not change. We want to distinguish towns because 
of their shape and not because of their position inside the grid and, there- 
fore, we will only consider optimal towns that are placed around the origin. 
Lemma E] makes this more precise: an optimal n-town is roughly symmetric 
with respect to a vertical and a horizontal symmetry line, see Fig. |2] for an 
illustration. Perfect symmetry is not possible since some rows or columns 
may have odd length and others even length. 

We need some notation before: For an n-town S, the i-th column of 
S is the set Cj = { (i, y) G S : y G Z } and the i-th row of S is the set 
Ri = {(x,i) E S : x EZ}. 



in S because t is removed from S. 



□ 




□ 
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Lemma 3 (Symmetry). In every optimal n-town S, the centers of all rows 
of odd length lie on a common vertical grid line V a . The centers of all rows 
of even length lie on a common line V e that has distance | from V a . A 
corresponding statement holds for the centers of odd and even columns that 
lie on horizontal lines H Q and H e of distance | . Moreover, without changing 
its cost, we can place S such that H Q and V Q are mapped onto the x-axis and 
y-axis, respectively, and H e and V e lie in the negative halfplanes. 

Proof. For a row Rj and r e Z, let Rj + (r, 0) be the row Rj horizontally 
translated by (r, 0). If two rows Ri and Rj are of the same parity, a straight- 
forward calculation (using Lemma U} shows that the cost c(Ri,Rj + (r, 0)) 
is minimal if and only if the centers of Ri and Rj + (r, 0) have the same re- 
coordinate. If the parities differ, c(Ri,Rj + (r, 0)) is minimized with centers 
having x-coordinates of distance 1/2. The total cost is 

c{S) = c x (S) + c y (S) = ^2 ^2 \ s x ~ t x \ + c y (S) 

i,j seRi,t£Rj 

If we translate every row R { of S horizontally by some (r iy 0), c y (S) does not 
change. The solutions that minimize ^2 s€R . teR . \ s x + r j — t x + r 3 -| for all i, j 
simultaneously are exactly those that align the centers of all rows of even 
length on a vertical line V e and the centers of all rows of odd length on a 
vertical line V at offset | from V e . The existence of the lines H and H e 
follows analogously. 

We can translate 5* such that H and V are mapped onto the x- and the 
y-axis and rotate it by a multiple of 90° degrees such that H e and V e lie in 
the negative halfplanes. These operations do not change c(S). □ 

From the convexity statement in Lemma |5] (together with Lemma [3]) we 
know that Co is the largest column, and the column lengths decrease to both 
sides, and similarly for the rows. Our algorithm will only be based on this 
weaker property (orthogonal convexity); it will not make use of convexity per 
se. We will, however, use convexity one more time to prove that the lengths 
of the columns are 0(y/n), in order to reduce the running time. 

In the following we assume the symmetry property of the last lemma. 
For an n-town S, let the width of 5* be w(S) = max ieZ and the height of 
S be h(S) = max ieZ \Cj\. We will now show that the width and the height 
cannot differ by more than a factor of 2. Together with convexity, this will 
imply that they are bounded by 0(y/n) (Lemma [5]). 
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Lemma 4. For every optimal n-town S, 

w{3) > h{S)/2 - 3 and h{S) > w{S)/2 - 3. 

Proof. Let S be an n-town, w = w(S), and h = h(S), and assume w < 
h/2 — 3. Let t = (0,1) be the topmost and (k,0) be the rightmost point 
of S, with I = LVJ and k = L^J- Let r = (k + 1,0). We show that 
c((S \ t) U r) < c(S), and thus, S is not optimal. By Lemma [U the change 
in cost is c(r, S) — c(t, S) — \k + I + 1|. We show that c(r, S 1 ) — c(t, S) < 
by calculating this difference column by column. This proves then that 
replacing t with r yields a gain of at least \l + k + 1| > 1, and we are done. 
Let us calculate the difference c(r, Cj) — c(t, Cj) for a column Cj of height 
\Cj\ = s<h: 

c(r,C j )-c(t,C j )= J2 (\i\-(l-i)) + s(k + l-j-\j\) 

= ^(<-(i-i))+ ^(<-(/ + <)) + a (A ; + l-j-|j|) 

i=0 i=l 

= 2 £\-_ s Z + s (A;+l-j-|j|) 

i=0 

< - + S (jfc + 1) < I - S ^ + , . H±i 

= f (s - 2/i + 2w + 6) < f (-h + 2w + 6) < 

□ 

Lemma 5. For every optimal n-town we have 

m&x{w(S), h(S)} < 2y/E + 5. 

Proof. Let w = w(S) and h = h(S). Assume without loss of generality 
that h > w. We know from Lemma H] that w > h/2 — 3. By Lemma [31 
we choose a topmost, a rightmost, a bottommost, and a leftmost point of 
S such that the convex hull of these four points is a quadrilateral with a 
vertical and horizontal diagonal, approximately diamond-shaped. Let H be 
the set of all grid points contained in this quadrilateral. The area of the 
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Figure 2: The lines V a , V e , H Q , and H e from Lemma [3] The rectangle R w and the set of 
points above and below it with cardinality U w and D w , respectively. The gray points are 
the corner points of R w . In this example, the height c w of column w is set to c = 4. 



quadrilateral equals (w — l)(h — l)/2, and its boundary contains at least 
4 grid points. Pick's theorem says that the area of a simple grid polygon 
equals the number of its interior grid points Hi plus half of the number of 
the grid points Ho on its boundary minus 1. This implies \H\ = \Hi\ + \ Hq\ = 
(|if;| + |#o|/2-l) + |#o|/2 + l > (>-l)0-l)/2 + 3. Because of LemmaEl 
all points in H belong to S. Since H consists of at most n points, we have 

n>\H\>(w- l)(h - l)/2 + 3 > (h/2 - A)(h - l)/2 + 3 

Solving the equation h 2 — 9h + 20 — An = shows that 

h < 2 y/n + 1/16 + 9/2 < 2^ + 5. □ 



3. Computing Optimal Solutions 

We will now describe a dynamic-programming algorithm for computing 



optimal towns. A program for this algorithm is listed in Appendix A 



We denote by q = \Ci\ the number of selected points in column i and by 
cf and c~ the row index of the topmost and bottommost selected point in 
Ci, respectively. We have Ci = cf — c~ + 1; see Fig. |2j 
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Lemma 6. Let S be an optimal n-town (placed as described in Lemma Ej) 
containing the points (i, cf) and (i, c~), for i > 0. Then all points inside the 
rectangle x [c~,c+] belong to S. 

Similarly, if S contains the points (— i, clj and (— i, cZj ; /or z > 1, i/iera 
z't contains all points in the rectangle — 1] x [cl^ctj. 

Proof. If (z,c+) and (i,c~) are contained in S then, by Lemma El (— i,cf) 
and (— z',c~) belong to S as well. By Lemma |2] all points inside the convex 
hull of these four points are contained in S. The same arguments hold for 
the second rectangle. □ 

Now we describe the dynamic program. It starts with the initial empty 
grid and chooses new columns alternating from the set of columns with non- 
negative and with negative column index, i.e., in the order 0, —1, 1, —2, 2, . . .. 
Let w > be the index of the currently chosen column and fix c w to a 
value c. We describe the dynamic program for columns with nonnegative 
index; columns with negative index are handled similarly. (In the program 
that is described in the appendix, we use a trick to avoid dealing with nega- 
tive columns: they are mapped to columns with positive index by reflecting 
everything at the y-axis, with a proper adjustment to take into account that 
the placement of Lemma [3] is not invariant under this transformation.) 

We know from Lemma [6] that in every optimal solution, every point inside 
the rectangle R w = [— w,w] x [c~,c+] is selected. We define 

cost( W , c, A|f , A° R , A^ L , A° l , U w , D w ) 

as the minimum cost of a town with columns — w, . . . , w of height Q > c for 
— w < i < w and c w = c where U w points lie above the rectangle R w , having 
a total distance A^ L and A^ R to the upper-left and upper-right corner of 
R w , respectively, and D w points lie below R w , having a total distance A° L 
and A° R to the lower-left and lower-right corner of R w . For a given n, we are 
looking for the n-town with minimum cost where (2w + l)c + U W + D w = n. 
Next we show that cost(u>, c, A^J R , A° R , A^ L , A° L , U W ,D W ) can be computed 
recursively. 

Consider the current column w with c w = c. The cost from all points in 
this column to all points above R w , in R w , and below R w can be expressed 
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as 

E( A ™+( c+ - A; )- f/ -)+ E E Ek^-o + i*-^] 

fc=c~ i=—wj=c~ k=c~ 

c+ 

+ E(^ r +(*- c ")-p«d- 

fe=c — 

We can transform this into 
c ■ (A™ + Ajf + £/„.£:+-£>„■ c") + c - ■ - t/ tt ) • ((c + 1) mod 2) 

+ (cW^) .(2w + l)- C -^, (2) 

which, obviously, depends only on the parameters w, c, A™, A° R , and 
D„ (the two parameters Aj^ L , A° L are needed if we consider a column with 
negative index). We denote the expression (J2]) by dist(u>, c, A^ R , A° R , Aj^ L , 
A° L , ?7 W , D w ) and state the recursion for the cost function: 

cost(w,c, A™, . . . , A° L , U w , D w ) 

= min {cost(-w, c_ w , A™, • • • , A^, 

c_„>c 

+ dist(u;,c,AU R ,...,A° L ,^,Dj (3) 

By Lemma |6] it suffices to consider only previous solutions with c_ w > c. In 
the step before, we considered the rectangle R_ w = [—w, w — 1] x [at w , cZ w )- 
Hence, the parameters with index — w can be computed from the parameters 
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with index w as follows: 



D 



U. 



—w 



—w 



U w -2w(ct w -c + ), 
D w -2w- (c~ - cZ w ), 



c 



,+ 



w 




E E [(^ - o + (j - c + )] 



[U w - U. w ] • (c 



c 




-EE [(™-i)+(c--j)] 



[D w -D_ w )-(c -cZ w + l). 



The parameters A^ and A°^ can be computed analogously and the cost 
function is initialized as follows: 



The bound on c has been shown in Lemma [51 

Theorem 7. An optimal n-town can be computed by dynamic programming 



Proof. We have to fill an eight-dimensional array cost(w, c, A UR , A DR , A UL , 
A DL ,U, D). Let C m ax denote the maximum number of occupied rows and 
columns in an optimum solution. By LemmaO, we know that C max = 0(y/n). 

The indices w and c range over an interval of size C max = 0(y/n). Let us 
consider a solution for some fixed w and c. The parameters U and D range 
between and n. However, we can restrict the difference between U and D 
that we have to consider: If we reflect the rectangle R = [— w,w] x [c~,c + ] 
about its horizontal symmetry axis, the U points above R and the D points 
below R will not match exactly, but in each column, they differ by at most 
one point, according to Lemma [31 It follows that \U — D\ < C max = 0(y/n). 
(If the difference is larger, such a solution can never lead to an optimal n- 
town, and hence we need not explore those choices.) In total, we have to 
consider only 0(n ■ *Jn) = 0(n 3 ^ 2 ) pairs (U, D). 



cost(0,c,0,0,0,0,0,0) 




if < c < + 5, 



otherwise. 



in 0(n 15 / 2 ) time. 
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The same argument helps to reduce the number of quadruples (A UL , A UR , 
A DL , A DR ). Each A- variable can range between and n ■ 2C max = 0(n 3 / 2 ). 
However, when reflecting around the horizontal symmetry axis of R, each 
of the at most D max differing points contributes at most 2C max = 0(\/n) 
to the difference between the distance sums A UL and A DL . Thus we have 
| A ul _ a dl| < Cmax . 2Cmax = Q(n), and similarly, |A UR - A DR | = 0(n). 

By a similar argument, reflecting about the vertical symmetry axis of 
R, we conclude that |A UL - A UR | = 0(n) and |A DL - A DR | = 0(n). In 
summary, the total number of quadruples (A UL , A UR , A DL , A DR ) that the 
algorithm has to consider is 0(n 3 ^ 2 ) ■ 0(n) ■ 0(n) ■ 0(n) = 0(n 9 ^ 2 ). In total, 
the algorithm processes 0(y/n) ■ 0(yjn) ■ 0{m?/ 2 ) ■ 0(n 9 ^ 2 ) = 0(n 7 ) 8-tuples. 
For each 8-tuple, the recursion (jSJ has to consider at most C max = 0(y/n) 
values C- w , for a total running time of 0(n 15 / 2 ). □ 



4. n- Towns, Cities, and n-Block Cities 

For large values of n, n-towns converge towards the continuous weight 
distributions of cities. However, the arrangement of buildings in many cities 
are discretized in a different sense: An n-block city is the union of n axis- 
aligned unit squares ("city blocks"), see Fig. [3] below. In the following, we 
discuss n-block cities, and we discuss the relation between n-towns and n- 
block cities. 

For a planar region R, let c'(R) be the integral of Manhattan distances 
between all point pairs in R: 



c'(R) := / / \\p — q\\ dpdq 

Jp&R JqGR 



IpeR Jg€R 

When R has area 1, this is the expected distance between two random points 
in R. Scaling a shape R by a factor of d increases the total co st by a factor 



of d 5 , i.e., by a factor of A 2 ' 5 for an area of A. This motivated iBender et al 



( 12004J ) to use the expression D(R) := j^ps as a scale-independent measure 



for the quality of the shape of a city. For example, a square Q of any side 
length a gets the same value 

1 / 2 ' ' 1 1,, oil, ,,,\ ^ 



D(Q) = — J J \xi - x 2 \ dx 2 dxi + a J J \y x - y 2 \ dy 2 dy x . :j 

A circle C yields D(C) = J* 2 , 5 ~ 0.6504 and the optimal shape achieves a 
value of V = 0.650 245 952 951 .. . 
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We will consider n-block cities Q(S) consisting of unit squares ("blocks") 
centered at a set of n grid points S C Z x Z. We denote a unit square 
centered at point s — (s x , s y ) by Q(s) = [s x — |, s x + »] x [s^ — |, s y + |], and 
then we have 

g(5) := |J Q(s). 
ses 

The average distance D(Q(S)) of an n-block city can be decomposed into 
average distances between blocks: 

c'(Q(S)) = J J \\p- q\\idpdq = ^2^2 J J \\p-q\\ x dpdq 

peQ(S) gGQ(S) seS tGS p£Q{s) q£Q(t) 

Using the notation 



d'(s) := / \\p- q\\idpdq, 

JpeQ(s) JqeQ(o) 

we can express this as 

c'(Q(S)) = J2J2 d '( s - t y ( 4 ) 

ses tes 

The average distance d'(s) between two square blocks at an offset s can be 
expressed as follows: If the two blocks don't lie in the same row or col- 
umn (s x 7^ and s y ^ 0), the average distance is simply the distance ||s||i 
between the centers, since positive and negative deviations from the block 
centers average out. When two blocks lie in the same column, then the y- 
distances average out to the y-distance between the centers, but the average 
x-distance is not the x-distance between the centers (which would be 0), but 
I-1/2 f-i/2 \ Xl ~ X2 \ dx 2 dxi = ~. Similarly, for two blocks in the same row, 
we must add | to the distance ||s||i between the centers. Finally, for two 
identical blocks, we have already seen that the average distance is |. We can 
express this compactly as 

d'(s) = d'{{s x , s y )) = \s x \ + l[s x jt 0] + |s„| + \[s y ^ 0] 
= \\s\\i + i[s x ^0] + l[s v ^0], 

where the notation [s x ^ 0] is 1 if the predicate s x ^ holds and otherwise. 
With these conventions, the expression (J3J) for the cost c'(Q(S)) of an n-block 
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city Q(S) looks very similar to (JTJ) for the cost c(S) of a town S, except for 
the correction terms | in the summands and for the factor |. The factor | 
accounts for the fact that in the sum c(S) of a town, each pair of (distinct) 
points is counted once, whereas in the integral c'(R), each pair of points is 
considered twice, as two ordered pairs. To make these expressions better 
comparable, we introduce the factor | and define 

c city (S) :=^V(Q(S0) = ^ 

ses tes 

The new "distance" d! is not a norm (for example, d'(0) 7^ 0), but the 
properties from Section [2] remain true for this new objective function. Thus, 
the dynamic programming formulation of Section [3] can be adapted to com- 
pute optimal n-block cities. As we shall now demonstrate, all lemmas of 
Section [2] hold verbatim for n-block cities, except that the formula for the 
change incurred by moving a single block to a different place (Lemma [1]) 
must be adapted: 

Lemma [TJ. Let S C 1? be the set of centers of an n-block city, t G S and 
r £ S. Then, 

c c it y {{S \t)Ur) = c city (S) - c city (t, S) + c city (r, S) - d'(r - t), 
where c city (p, S) := ^ q&s d'(p - q). 

The proof is the same as for Lemma [TJ One can easily check that the 
distance d' from t to itself and the distance from r to itself are correctly 
accounted for. □ 

Convexity of the optimum solution (Lemma |2]) holds true for n-block 
cities. The proof goes through almost verbatim. The expression d'(p) is 
not a norm, but it is a convex function of p, and this is all that is needed. 
Approximate symmetry of the optimal solution (Lemma|3]) remains true. The 
calculations (which have not been shown in detail anyway), are modified, but 
the conclusion is the same. 

When comparing an n-town S and a corresponding n-block city Q(S), 
we have to add 1/6 for each pair of blocks that are in the same column, and 
for each pair of blocks that are in the same row. Using the notation and 
Ci of Section [3] for row and column lengths, and writing Ctown(S') instead of 
c(S) for improved clarity, we get thus: 

c aty (S) - c town (S) = A(S) := |(E, 4 + J2j r?) (5) 
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This adjustment term accounts for the discretization effect. For example, a 
1-town has an average distance of 0, as all the weight is concentrated in a 
single point, while a 1-block city has an average distance of 2/3, just like any 
other square, (and a c C i ty value of 1/3). 

Using this expression, it is easy to show that the bound of 2 on the aspect 
ratio holds for n-block cities in the same form as in Lemma HI The proof of 
Lemma S] establishes that Ct own (S) decreases when a top- most point t from 
the longest column of height h is added to the right of the longest row of 
length w. For an n-block city, the adjustment term A(S) decreases by at 
least h 2 — (h — l) 2 when t is removed, and it increases by (w + l) 2 — w 2 + l 2 
when r is added. Under the assumption of the proof (w < h/2 — 3), the total 
change of A is negative, and the modified solution is an improvement also 
when 5* is regarded as an n-block city. 

From LemmaH]we conclude that the bound of 2^/n + 5 on the height and 
width of optimal n-block cities (Lemma |5]) holds as well. (The argument of 
the proof of Lemma |5] is purely geometric: it is based on convexity and does 
not use the objective function.) 

Thus, we conclude that the adjustment term (|5]) is asymptotically bounded 
by A(S) = 6(n L5 ). 

When considering the continuous weight distributions of n-block cities, we 
have to account twice for each pair of discrete block centers; hence, the appro- 
priate measure for the quality of an n-town is &{S) = 2c(S)/n 2 ' 5 . The mea- 
sure for the corresponding n-block city is ^(S) = 2(c t0 wn(S') + A(S))/n 2 ' 5 . 
Thus, the relative difference is ©(-)■ 

In Figure [3], we show the corresponding values for the small examples from 
Figure [TJ Figure [5] shows results for some larger values n. One can observe 
how and ^{S) converge from below and above towards the optimal city 
value ip of about 0.650 245 952 951. Note that convergence is not monotone, 
neither for $ nor for 

5. Computational Results 

Optimal n-towns and optimal n-block cities do not necessarily have the 
same shape. For n < 21, a comparison of Figures [U and [3] shows that, while 
not all optimal n-towns are optimal n-block cities, every optimal n-block city 
is simultaneously an optimal n-town. However, this is not always true. In 
fact, for n = 72, there are two shapes for optimal n-block cities, none of 
which is optimal for an n-town, see Figure HI This is the only instance of this 
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« = 
* = 



0.3536 * 
0.7071 * 



0.5132 <S> 
0.727 * 



0.5 * 
0.6667 * 



0.5724 <S> 
0.7036 * 



0.5862 
0.6788 



$ = 
* = 



0.567 « 
0.6804 * 



0.5966 « = 0.5925 * 
0.6776 * = 0.6667 * 



0.6072 * 
0.6725 * 



0.618 
0.6761 



« = 
* = 



0.6094 * 
0.6629 * 



« = 
* = 



0.6277 * 
0.6652 * 



0.6171 « = 
0.6663 * = 



0.63 « = 
0.6654 * = 



0.6191 
0.6654 



0.6304 
0.6639 



« = 
* = 



« = 
* = 



0.6243 
0.6671 



0.6295 
0.6615 



* = 0.6211 

* = 0.6616 



* = 0.6255 

* = 0.6561 



Figure 3: Values of $ and *f> for optimal n-block cities for n — 2, . . . , 21. All optimal 
solutions are shown, up to symmetries. These are simultaneously shapes of optimal n- 
towns, but for n = 3, 11, 15, 17, 18, 19, there are additional n-towns that are tied for the 
optimum (with the same value "J), cf. Figure [TJ 



phenomenon up to n = 80, but we surmise it will be more and more frequent 
for larger n. We have more comments on this phenomenon at the end of this 
section. 

We have calculated the optimal costs c town and c C i ty up to n = 80 points. 
The results are shown in Table [TJ When there are several optimal solutions 
(except symmetries), this is indicated by a star, together with the multiplic- 
ity. 

It is clear that an optimal n-block city is never better than an optimal 
continuous city of area n that has a value ipn 5 ^ 2 /2. Empirically we found 
the approximation c city w ipn 5 / 2 /2 + 0.115 • n 3//2 with ip = 0.650245952951. 
The order of magnitude of the "discretization penalty" 0.115 ■ r?l 2 = Q(n 3 ^ 2 ) 
is explained as follows: changing the continuous city of area n into blocks 
affects Q{^/n) squares along the boundary. For each adjustment in one of 
these squares, distances to n other squares are affected. 

Since c city is a multiple of 1/3, we rounded our estimate to the nearest 
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Tabic 1: Optimal towns c toW n and n-block cities c c i ty . * indicates multiple solutions. 
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Figure 4: The optimal n-town and the two optimal n-block cities for n = 72. 



multiple of 1/3 and used the approximation formula 

c city := [3ipn 5/2 /2 + 0.345 ■ n 3/2 ]/3 

The notation [•] denotes rounding to the nearest integer. Table [D shows the 
error E 2 := c C i ty — c c ; ty of this approximation. (Actually, the table shows 3E 2 , 
which is an integer.) 

For n-towns, on the other hand, we found the approximation c town ~ 
^ n 5/2/ 2 - 0.205 • n 3 / 2 . So this seems to approximate the optimal continuous 
city from below, but we do not have a proof of this fact. 

The deviation E% between c t0 wn and its approximation formula 

ctown := hM 5/ 72 - 0.205 • n 3 ' 2 ] 

is shown in Table [TJ 

Finally, we look at the difference between c city and c t0 wn- For a given 
point set S, it is the quantity A defined in ([5]). It is estimated as 0.32 • n 3 ^ 2 . 
The table shows the error E 3 := 3 ■ (c C i ty — c t0 wn) — [0-96 • n 3 / 2 ] . Apart from 
the rounding, E 3 would equal 3(E 2 — Ex). 

One can see that the error E 3 is much smaller than one might expect 
from the random- looking fluctuations of Ei and E 2 . This can be explained 
by the fact that the expression (|5]) for A(S) is apparently not so sensitive to 
small deviations of the shape S. 

Accordingly, Table [1] exhibits the tendency that the deviations of c t0 wn 
and c C ity "above" and "below average" occur for the same values of n: n-towns 
and n-block cities with the same number n suffer equally from the effects of 
discretization. A glance at the optimal solutions (Figures [T] and |3]) shows 
that the costs are below average when the shapes are highly symmetric, for 
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n = 58, c(S) = 8243 
$ = 0.6434 
* = 0.6546 



n = 59, c{S) = 8604 
$ = 0.6436 
$ = 0.6545 



n = 60, c(S) = 8968 
$ = 0.6432 
* = 0.6539 



Figure 5: Optimal n-towns for n = 58, 59, 60; these are simultaneously the shapes of the 
optimal n-block cities. 



example n = 9, 12, 21, but also n = 60 (FigureEJ). On the other hand, when 
there is no unique "very good" shape, one can expect a greater variation of 
different solutions that try to come close to the optimal continuous shape. 
Indeed, larger values of E\ and E 2 in Table [1] tend to go hand in hand with 
a greater multiplicity of optimal solutions. The worst values of E\ and E 2 
occur for n = 72; this is the first value of n where optimal n-block cities and 
optimal n-towns differ (Figure H]). This is probably no coincidence: when 
there is a greater variety of solutions that can compete for being best, the 
distinction of the objective function between n-block cities and n-towns is 
more likely to make a difference. 

6. Outlook 

We have shown that optimal n-towns can be computed in time 0(n 7 ' 5 ). 
This is of both theoretical and practical interest, as it yields a method poly- 
nomial in n that also allows extending the limits of the best known solutions; 
however, there are still some ways how the result could be improved. 

Strictly speaking, the method is only pseudo-polynomial, as the input size 
is O(logn). It is not clear how the corresponding output could be described 
in polylogarithmic space; any compact encoding would lead to a good and 
compact approximation of the optimal (continuous) city curve, for which 
there is only a description by a differential equation with no known closed- 
form solution. For this reason we are sceptical that a polynomial solution is 
possible. 
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We are more optimistic about lowering the number of parameters in our 
dynamic program, and thus the exponent, by exploiting convexity or stronger 
symmetry properties. This may also make it possible to compute optimal 
solutions for larger n. One possible avenue could arise if partial solutions 
would satisfy some monotonicity property; however, the unique optimal 9- 
town is not contained in the unique optimal 12-town. Thus, there is no way 
for a town to organically grow and remain optimal at all times. Generally, an 
optimal n-town does not necessarily contain an optimal (n — l)-town. (The 
smallest example occurs for n = 35, no optimal 35-town contains an optimal 
34-town.) 

As discussed in the last section, there is still a variety of questions re- 
garding the convergence of optimal solutions for growing n, approaching the 
continuous solution in the limit. As indicated, we have a pretty good idea 
how this continuous value is approximated from below and above by n-towns 
and n-block cities; however, we do not have a formal proof of the lower bound 
property of n-block cities. 

It is easy to come up with good and fast approximation methods: In the 
continuous case, even a square is within 2.5% of the optimal shape; a circle 
reaches 0.02%; consequently, simple greedy heuristics will do very well. Two 
possible choices are iteratively adding points to minimize the total cost, or 
(even faster) as close as possible to a chosen center. 

As mentioned in the introduction, a closely related, but harder problem 
arises when n locations are to be chosen from a given set of k > n p oints, 
instead of the full integer grid. This was studied by lBender et al.l (120081 ). who 
gave a PTAS, but were unable to decide the complexity. It is conceivable that 
a refined dynamic-programming approach may yield a polynomial solution; 
however, details can expected to be more involved, so we leave this for future 
work. The same holds for other metrics. 

Finally, one can consider the problem in higher dimensions. A crucial 
property of our dynamic-programming solution is that the interface between 
the points in the columns that have already been constructed and the points 
to be added in the future can be characterized by a few parameters. A 
similar property does not hold in three dimensions, and therefore one cannot 
extend our dynamic-programming approach to higher dimensions. For the 
same reason, the Euclidean distance version cannot be solved by our method, 
since, unlike in the Manhattan case, the effect of the U w points on the upper 
side of the current rectangle on the distance to points that are inserted in the 
future cannot be summarized in the parameters A UR and A UL . Moreover, 



21 



in higher dimensions, there is no known solution for the continuous case; the 
corresponding calculus-of-variations problem will be harder to solve than in 
two dimensions. 

Acknowledgements. We thank the reviewers for their careful reading and 
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Appendix A. Program for Computing Optimal Towns 

Figure El shows a short program in the programming language PYTHON 
that implements our algorithm. The program calculates and prints the costs 
of optimal n-towns for all values of n up to the specified limit n = n_target. 
Instead of an 8-dimensional array, the costs are stored as a dictionary in 
the variable cost_array [w , cc] [D_up_right , D_down_r ight , D_up_lef t , 
D_down_left, n_up, n_down] . This makes the program a lot simpler, since 
we don't have to worry about allocating arrays with explicit limits, and incurs 
little overhead, since internally, Python dictionaries are implemented as 
hash tables, providing constant expected access time. 

Instead of adding rows alternately on the left and on the right, the pro- 
gram always adds a new row on the left side, but (implicitly) reflects the 
town about the y-axis when storing a cost value, achieving the same effect. 

The main loop of the program does not use the recursion in the form ([3]) , 
which calculates the optimum cost of a configuration from all partial solution 
that lead to it when a column is added. Instead, it makes a "forward" trans- 
fer, generating all successor configurations of a given configuration. This has 
the advantage that certain "impossible" parameter sets are automatically ex- 
cluded. For example, in the running time analysis for Theorem we argued 
that parameter pairs U, D with \U — D\ > C max need not be considered. (The 
parameters U and D correspond to the variables n_up and n_down.) Since 
the program only adds columns which are (approximately) balanced about 
the x-axis, it will never generate solutions with such parameters. 

The program can be adapted for computing optimal n-block cities. Then 
the additional cost A from (jSJ) between blocks in the same row or column 
must be taken into account. One simply has to extend the last line in the 
computation of new_cost: 

c*c * w*(w+l)/2 ) 

to 

c*c * w*(w+l)/2 ) * 6 + c*c + (cc-c)*w*w 
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n_target = 40 # run up to this value of n 

cost_array = {} # initialize data for "array" 
from math import sqrt 
width_limit = int(2*sqrt(n_target)+5) 
for w in range (0 ,width_limit+2) : 
for cc in range (width_limit , , -1) : 
cost_array [w , cc] ={} 
MAX = n_target**3 # "infinity", trivial upper bound on cost 
opt = (n_target+l) * [MAX] # initialize array for optimal values 

cost_array [0,width_limit] [0,0,0,0,0,0]=0 # starting "town" with no columns 
for w in range(0,width_limit+l) : 
for cc in range (width_limit , , -1) : 
for (D_up_right, D_down_right , D_up_left, D_down_left, 
n_up, n_down) , cost in cost_array [w, cc] . items () : 

D_up_left += n_up # add 1 horizontal unit to all left-distances 
D_down_left += n_down 

for c in range (cc , -1 , -1) : # decrease size c of new column one by one 
n = n_up+n_down + (w+l)*c # (w = previous value of w) 
if n <= n_target: # total number of occupied points so far 
new_cost = cost + ( (D_up_left + D_down_left) * c + 
(n_up + n_down) * c*(c-l)/2 + 
(c+l)*c*(c-l)/6 * (2*w+l) + 
c * c * w*( w +l)/2 ) 
if c==0 : # a completed town 

opt [n] = min( new_cost, opt [n] ) 
else: # store cost of newly constructed partial town 

ind = (D_up_left, D_down_left, D_up_right, D_down_right , 

n_up, n_down) # exchange left and right when storing 
cost_array [w+1 , c] [ind] = min ( new_cost, 
cost_array[w+l, c] .get (ind, MAX) ) 
# decrease c by 1 : 

if (c°/ 2)==l: # remove an element from the top of the leftmost column 
n_up += w 

D_up_left += n_up + w*(w+l)/2 
D_up_right += n_up + w*(w-l)/2 
else: # remove from the bottom 
n_down += w 

D_down_left += n_down + w*(w+l)/2 
D_down_right += n_down + w*(w-l)/2 

for n in ranged, n_target+l) : print n, opt [n] 

Figure .6: Python program for computing optimal n-towns 
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The resulting cost is scaled by a factor of 6, but the end result is then always 
even, so we could divide it by 2 (cf. Tabled) all values c city (n) are multiples 
of 1/3). 

For n = 40, the program takes a few seconds, but for n = 80 it takes hours. 
For larger n the space becomes a more severe bottleneck than the running 
time; thus it is important to release storage when it is no longer needed, for 
example by resetting cost_array [w, cc] ={} after each outer loop. There are 
several possibilities to speed up the program. The cost of some approximately 
circular solution can be taken as an initial upper bound. With this upper 
bound, one can then derive a stronger bound width_limit on the maximum 
height and width by ad-hoc methods. During the calculation, one can also 
prune cost values that are so large that they cannot possibly lead to a better 
solution. The given program computes only the optimum cost. We have 
extended it to also remember the optimal solutions. This program has 133 
lines and was used to produce the data of Table [TJ 
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